Overcoming data scarcity with transfer learning

Authors

  • Maxwell L. Hutchinson
  • Erin Antono
  • Brenna M. Gibbons
  • Sean Paradiso
  • Julia Ling
  • Bryce Meredig
Abstract

Despite increasing focus on data publication and discovery in materials science and related fields, the global view of materials data is highly sparse. This sparsity encourages training models on the union of multiple datasets, but simple unions can prove problematic as (ostensibly) equivalent properties may be measured or computed differently depending on the data source. These hidden contextual differences introduce irreducible errors into analyses, fundamentally limiting their accuracy. Transfer learning, where information from one dataset is used to inform a model on another, can be an effective tool for bridging sparse data while preserving the contextual differences in the underlying measurements. Here, we describe and compare three techniques for transfer learning: multi-task, difference, and explicit latent variable architectures. We show that difference architectures are most accurate in the multi-fidelity case of mixed DFT and experimental band gaps, while multi-task most improves classification performance of color with band gaps. For activation energies of steps in NO reduction, the explicit latent variable method is not only the most accurate, but also enjoys cancellation of errors in functions that depend on multiple tasks. These results motivate the publication of high quality materials datasets that encode transferable information, independent of industrial or academic interest in the particular labels, and encourage further development and application of transfer learning methods to materials informatics problems.
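As a rough illustration of the difference architecture named above, the sketch below fits one model to a plentiful source dataset and a second, simpler model to the gap between the scarce target labels and the source model's predictions. Everything here is an assumption made for illustration: the two toy functions `low_fidelity` and `high_fidelity` are hypothetical stand-ins for two fidelities of one property (not DFT or experimental band gaps), and polynomial least squares stands in for whatever learner the authors actually used.

```python
import numpy as np

# Hypothetical toy functions standing in for two fidelities of one property.
# They are NOT real band-gap data; they only share a common underlying shape.
def low_fidelity(x):
    return np.sin(3.0 * x)

def high_fidelity(x):
    return np.sin(3.0 * x) + 0.5 * x + 0.2

# Plentiful source data, scarce target data (assumed sizes).
x_source = np.linspace(-1.0, 1.0, 200)
y_source = low_fidelity(x_source)
x_target = np.linspace(-1.0, 1.0, 5)
y_target = high_fidelity(x_target)

# Step 1: fit a flexible model to the plentiful source task.
source_coef = np.polyfit(x_source, y_source, deg=7)

# Step 2: fit a simple model to the difference between the target labels
# and the source model's predictions at the target points.
residual = y_target - np.polyval(source_coef, x_target)
diff_coef = np.polyfit(x_target, residual, deg=1)

def predict_difference(x):
    # Target prediction = source prediction + learned difference.
    return np.polyval(source_coef, x) + np.polyval(diff_coef, x)

# Baseline: the same simple model trained on the scarce target data alone.
direct_coef = np.polyfit(x_target, y_target, deg=1)

def predict_direct(x):
    return np.polyval(direct_coef, x)

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

x_test = np.linspace(-1.0, 1.0, 101)
rmse_difference = rmse(predict_difference(x_test), high_fidelity(x_test))
rmse_direct = rmse(predict_direct(x_test), high_fidelity(x_test))
print("difference model RMSE:", rmse_difference)
print("direct model RMSE:", rmse_direct)
```

In this toy setting the difference model benefits because the gap between the two fidelities is much simpler than either function on its own, so five target points are enough to learn it. Whether that holds for a real pair of datasets depends on how the two measurements are actually related.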

Journal:
  • CoRR

Volume abs/1711.05099

Publication date: 2017